On a Source-Coding Problem with
Two Channels and Three Receivers 

By L. OZAROW 
This paper treats the problem of communicating a memoryless
unit-variance Gaussian source to three receivers. Two channels are 
available, each with a separate receiver. A third receiver has the 
outputs of both channels available. We obtain an expression for the 
simultaneously achievable distortions (mean-squared error). This 
problem applies to the following situation: Assume that high-quality 
reproduction of a source is desired at a single receiver which is 
connected to the source over a pair of links operating in parallel. 
Further assume that the links are unreliable in that either may fail, 
and that the source encoder is unaware of the failures. One can then 
ask how "robust" a system designed for this situation can be. That is, 
what are the limits on the fidelity achievable when both links are 
functioning if graceful degradation is required during the failure of 
either link? An inverse relation between performance in the two 
modes is obtained in the sense that, as performance in the presence 
of both links approaches its theoretical optimum, average distortion 
during failures becomes large. Conversely, if near-ideal performance
during link failures is desired, then the distortion achieved when 
both links operate is far from its optimum value. 

I. INTRODUCTION 

Consider the following communication problem: An encoder is presented with a sequence of source letters $\{X_k\}$ drawn from alphabet $\mathcal{X}$. We assume the $\{X_k\}$ are independent and identically distributed with probability mass function $p(x)$ (or a probability density function if $\mathcal{X}$ is continuous). For each block of $N$ letters ($N$ arbitrary), two discrete encoder outputs $f_1(\mathbf{X})$ and $f_2(\mathbf{X})$ are produced ($\mathbf{X}$ is a vector of $N$ letters). The cardinalities of $f_1$ and $f_2$ are limited by

$$\frac{1}{N} \log \| f_i(\mathbf{X}) \| \le R_i, \quad i = 1, 2,$$
where the base of the log is arbitrary, but taken to be $e$ in the sequel.
Then $R_i$ is the maximum rate at which information can be conveyed
over the $i$th channel, in nats per source letter.

We assume the existence of three receivers which must estimate $\mathbf{X}$ using $f_1$ alone, $f_2$ alone, and both $f_1$ and $f_2$. The three estimates, denoted by $\hat{\mathbf{X}}_1$, $\hat{\mathbf{X}}_2$, and $\hat{\mathbf{X}}_3$, are $N$-vectors in some reproducing alphabets ($\hat{\mathcal{X}}_1$, $\hat{\mathcal{X}}_2$, $\hat{\mathcal{X}}_3$) which in general may not coincide with each other or with $\mathcal{X}$. Distortions $d_1$, $d_2$, and $d_3$ are incurred at the respective receivers according to

$$d_i = \frac{1}{N} \sum_{k=1}^{N} E[\delta_i(X_k, \hat{X}_{ik})], \quad i = 1, 2, 3,$$

where $\delta_i(\cdot,\cdot)$ is a nonnegative real-valued function defined on $\mathcal{X}$ and $\hat{\mathcal{X}}_i$. This configuration is summarized in Fig. 1.

The case of only one receiver is the classical rate-distortion problem (Ref. 1). Corresponding to the source statistics and the distortion measure, the rate-distortion function is defined by

$$R(d) = \inf_{p \in P_d} I(X; \hat{X}),$$

where $P_d = \{p(\hat{x} \mid x) : E[\delta(X, \hat{X})] \le d\}$ and $I(X; \hat{X})$ is the mutual information between $X$ and $\hat{X}$. A forward coding theorem and its converse exist to the effect that for any $d$ (for which $P_d$ is nonempty) and any $\epsilon > 0$ there is a block length $N$ and a code with at most $e^{R(d)N}$ words such that $\hat{\mathbf{X}}$ (an estimate of $\mathbf{X}$ determined by the encoder output) satisfies

$$\frac{1}{N} \sum_{k=1}^{N} E[\delta(X_k, \hat{X}_k)] \le d + \epsilon.$$

Conversely, for any code with rate less than $R(d)$, the distortion can be no smaller than $d$. An alternate way of stating this last fact is that if $I(\mathbf{X}; \hat{\mathbf{X}}) \le NR$, then

$$\frac{1}{N} \sum_{k=1}^{N} E[\delta(X_k, \hat{X}_k)] \ge d^*,$$

where $R(d^*) = R$.

Fig. 1. The channel-splitting problem.

A natural problem for the network of Fig. 1 is to characterize the set of achievable quintuples $(R_1, R_2, d_1, d_2, d_3)$. Although this problem is as yet unsolved for arbitrary sources and distortion measures, we have obtained the solution for one important special case: that of a Gaussian source with squared-error distortion. In this case, the source and reproduction alphabets are the real line, and

$$\delta_i(x, \hat{x}) = (x - \hat{x})^2, \quad i = 1, 2, 3.$$

The rate-distortion function for this source and distortion measure is given by Ref. 1, Theorem 4.3.2:

$$R(d) = \frac{1}{2} \log \frac{\sigma_x^2}{d} \quad \text{nats/source letter}, \tag{1}$$

where $\sigma_x^2$ is the variance of $X$, here assumed to be 1.

As noted above, $R(d)$ gives the minimum mutual information per letter required to reproduce source $X$ with average distortion $d$. $R(d)$ may be inverted to yield the distortion-rate function [i.e., the solution to $R(d^*) = R$] given by

$$D(R) = e^{-2R}, \tag{1a}$$

which, from the converse to the rate-distortion theorem stated above, is the minimum average distortion achievable in representing $N$-vector $\mathbf{X}$ by $\hat{\mathbf{X}}$, when the average mutual information between vectors $\mathbf{X}$ and $\hat{\mathbf{X}}$ is less than or equal to $NR$.
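As a small numerical illustration of (1) and (1a), the following Python sketch evaluates both functions and confirms that they are inverses of each other. The function names are hypothetical, and the clamp to zero rate for $d \ge \sigma_x^2$ is the standard convention for the Gaussian source rather than something stated in the text above.

```python
import math

def rate_distortion(d, variance=1.0):
    """R(d) of eq. (1), in nats per source letter.

    Assumption: the rate is taken as zero once d reaches the source variance.
    """
    if d <= 0.0:
        raise ValueError("distortion must be positive")
    return max(0.0, 0.5 * math.log(variance / d))

def distortion_rate(rate):
    """D(R) of eq. (1a) for the unit-variance source."""
    if rate < 0.0:
        raise ValueError("rate must be nonnegative")
    return math.exp(-2.0 * rate)

# Round trip: D(R(d)) should return d for any 0 < d <= 1.
for d in (0.5, 0.25, 0.01):
    print(d, rate_distortion(d), distortion_rate(rate_distortion(d)))
```

Each extra nat of rate multiplies the attainable distortion by $e^{-2}$, which is the scaling that the bounds in the rest of the paper are measured against.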

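To obtain one obvious outer bound to the set of simultaneously achievable $(R_1, R_2, d_1, d_2, d_3)$, observe that estimate $\hat{\mathbf{X}}_i$ is a function of $f_i(\mathbf{X})$ for $i = 1, 2$, and that $\hat{\mathbf{X}}_3$ is a function of $(f_1(\mathbf{X}), f_2(\mathbf{X}))$. Using the data-processing theorem (Ref. 1), we get

$$I(\mathbf{X}; \hat{\mathbf{X}}_1) \le I(\mathbf{X}; f_1(\mathbf{X})) \le H(f_1(\mathbf{X})) \le NR_1.$$

Similarly,

$$I(\mathbf{X}; \hat{\mathbf{X}}_2) \le NR_2$$

and

$$I(\mathbf{X}; \hat{\mathbf{X}}_3) \le N(R_1 + R_2).$$

Using (1a) then,

$$d_1 \ge D(R_1) = \exp(-2R_1),$$
$$d_2 \ge \exp(-2R_2),$$
$$d_3 \ge \exp[-2(R_1 + R_2)]. \tag{2}$$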
In a single-destination problem, the forward part of the rate-distortion theorem implies that the distortion-rate function may be approached arbitrarily closely for sufficiently large block lengths. If the same result applied here, then the inequalities in (2) could be replaced by approximate equalities. In particular, $d_3 \approx d_1 d_2$ as $d_1$ and $d_2$ approach their appropriate lower limits. We will show that this performance is not achievable. The actual set of achievable points is characterized by the following:

Theorem 1: The achievable set of quintuples $(R_1, R_2, d_1, d_2, d_3)$ for Fig. 1 is given by the set of points satisfying

$$d_1 \ge \exp(-2R_1),$$
$$d_2 \ge \exp(-2R_2),$$
$$d_3 \ge \exp[-2(R_1 + R_2)] \cdot \frac{1}{1 - (\sqrt{\Pi} - \sqrt{\Delta})^2}, \tag{3}$$

where $\Pi = (1 - d_1)(1 - d_2)$ and $\Delta = d_1 d_2 - \exp[-2(R_1 + R_2)]$.
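The last inequality of (3) is easy to evaluate numerically. The sketch below is a minimal Python illustration; the function name and the tolerance parameter are hypothetical. As an assumption of this sketch only, it evaluates the expression in the regime $\Pi \ge \Delta$ (which covers both examples discussed next) and raises an error otherwise.

```python
import math

def joint_distortion_bound(r1, r2, d1, d2, tol=1e-12):
    """Right-hand side of the last inequality in (3).

    Assumption of this sketch: only the regime Pi >= Delta is handled.
    """
    if not (0.0 < d1 <= 1.0 and 0.0 < d2 <= 1.0):
        raise ValueError("side distortions are expected in (0, 1]")
    if d1 < math.exp(-2.0 * r1) - tol or d2 < math.exp(-2.0 * r2) - tol:
        raise ValueError("d1, d2 violate the first two inequalities of (3)")
    floor = math.exp(-2.0 * (r1 + r2))       # exp[-2(R1 + R2)]
    pi_term = (1.0 - d1) * (1.0 - d2)        # Pi
    delta = max(d1 * d2 - floor, 0.0)        # Delta, clamped against rounding
    if pi_term < delta - tol:
        raise ValueError("this sketch only covers Pi >= Delta")
    gap = math.sqrt(pi_term) - math.sqrt(delta)
    return floor / (1.0 - gap * gap)

# Side channels on their rate-distortion curves: the bound is far above d1*d2.
d = math.exp(-2.0)
print(joint_distortion_bound(1.0, 1.0, d, d), d * d)
```

Because $\sqrt{\Delta}$ enters the expression, values of $\Delta$ very close to zero are sensitive to floating-point rounding, which is why the sketch clamps $\Delta$ at zero.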

Two simple examples will clarify the behavior of the region specified in the theorem. In the first example, set $R_1 = R_2 = R$ and assume that $d_1 = d_2 = d \approx e^{-2R}$. That is, the distortion obtained over each side channel is essentially on the appropriate rate-distortion curve. In this case, $\exp[-2(R_1 + R_2)] \approx d_1 d_2 = d^2$, $\Delta \approx 0$, $\Pi = (1 - d)^2$, and the last inequality in (3) becomes

$$d_3 \ge d^2 \cdot \frac{1}{1 - (1 - d)^2} = \frac{d^2}{2d - d^2} = \frac{d}{2 - d} \ge \frac{d}{2},$$

so that the achievable distortion over the joint channel is no better than half the distortion on the side channels. For any interesting (i.e., small) value of $d$, this is far worse than the value $d_3 \ge d^2$ obtained in (2).
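To make the gap in this first example concrete, the following Python sketch tabulates, for a few illustrative rates, the side distortion $d = e^{-2R}$, the joint bound $d/(2 - d)$ from (3) with $\Delta = 0$, and the value $d^2$ from (2). The function name and the chosen rates are hypothetical.

```python
import math

def symmetric_case(rate):
    """Return (d, joint bound from (3), d^2 from (2)) for R1 = R2 = rate."""
    d = math.exp(-2.0 * rate)       # side distortion on the rate-distortion curve
    joint_bound = d / (2.0 - d)     # last inequality of (3) with Delta = 0
    cut_set = d * d                 # exp[-2(R1 + R2)] from (2)
    return d, joint_bound, cut_set

for rate in (0.5, 1.0, 2.0, 3.0):
    d, joint_bound, cut_set = symmetric_case(rate)
    print(f"R={rate:.1f}  d={d:.6f}  d3>={joint_bound:.6f}  d^2={cut_set:.8f}")
```

As the rate grows, the ratio of the joint bound to $d^2$ grows roughly like $1/(2d)$, which is the sense in which the joint receiver falls far short of the bound in (2).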

At the opposite extreme, assume that $d_3 \approx \exp[-2(R_1 + R_2)]$. That is, the encoder is designed to provide as good a performance as possible for the joint estimate. From the last inequality in (3), then

$$1 - (\sqrt{\Pi} - \sqrt{\Delta})^2 \approx 1,$$

which implies that

$$\Pi \approx \Delta,$$
$$(1 - d_1)(1 - d_2) \approx d_1 d_2 - \exp[-2(R_1 + R_2)],$$
$$1 - d_1 - d_2 + d_1 d_2 \approx d_1 d_2 - \exp[-2(R_1 + R_2)].$$

Therefore,

$$d_1 + d_2 \approx 1 + \exp[-2(R_1 + R_2)].$$
Note that the value $d = 1$ can be obtained with no information, merely by always estimating $X_k$ by its mean. In this example, if either $d_1$ or $d_2$ is small (not even necessarily near its rate-distortion bound), then the other side-channel distortion must be near 1. In other words, the latter estimate is virtually useless by itself.
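A worked instance of this second extreme: the Python sketch below takes the relation $d_1 + d_2 = 1 + \exp[-2(R_1 + R_2)]$ as an exact equality (an idealization of the approximate statement above) and returns the side distortion forced on the second channel. The function name and the sample values are hypothetical.

```python
import math

def forced_other_distortion(r1, r2, d1):
    """d2 implied by d1 + d2 = 1 + exp[-2(R1 + R2)], treated as an equality."""
    return 1.0 + math.exp(-2.0 * (r1 + r2)) - d1

r1 = r2 = 2.0
d1 = 0.05                                   # a reasonably good side description
d2 = forced_other_distortion(r1, r2, d1)    # close to 1: nearly useless alone
print(d1, d2)
```

With these values $\Pi$ and $\Delta$ coincide, so the factor multiplying $\exp[-2(R_1 + R_2)]$ in (3) equals 1.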

To account for these properties intuitively, note that the encodings which lead to $f_1(\mathbf{X})$ and $f_2(\mathbf{X})$ describe partitions of $\mathbb{R}^N$, which we denote by $\{A_m\}_{m=1}^{\exp(NR_1)}$ and $\{B_n\}_{n=1}^{\exp(NR_2)}$, so that knowledge of $f_1$ or $f_2$ specifies which $A_m$ or which $B_n$ contains $\mathbf{X}$, and knowledge of both $f_1$ and $f_2$ specifies which intersection $C_{mn} = A_m \cap B_n$ contains $\mathbf{X}$. The distortion achieved is then the moment of inertia of the corresponding set around its centroid.

If $d_1$ and $d_2$ are both small, then, on the average, the $\{A_m\}$ and $\{B_n\}$ are highly concentrated around their centroids. Therefore, each $A_m$ can intersect only a few $B_n$, and the moment of inertia of the average $C_{mn}$ can only be smaller than that of $A_m$ by a moderate fraction. Conversely, if $d_3$ is close to $\exp[-2(R_1 + R_2)]$, then the joint entropy of $f_1$ and $f_2$ must be close to $N(R_1 + R_2)$, which is no smaller than the sum of the individual entropies. Therefore, $f_1$ and $f_2$ must be nearly independent. If this is true, then knowledge of $A_m$ must yield very little information about which $B_n$ contains $\mathbf{X}$. In other words, the average $A_m$ must intersect essentially all the $B_n$. Therefore, if $d_2$ is small, implying that the $B_n$ are concentrated, then since they must be distributed to cover $\mathbb{R}^N$, each $A_m$ must have content throughout $\mathbb{R}^N$, and its moment is large.

We now prove the theorem. In the proof that follows, we use the 
converse to the source-coding theorem cited above, the notation h(Z) 
to denote the differential entropy of a continuous random vector Z 
(Ref. 1, p. 86), and the following two lemmas: 

Lemma 1: If a continuous random vector $\mathbf{Z}$ with $N$ components has covariance matrix $\Phi$, then the differential entropy $h(\mathbf{Z})$ satisfies

$$h(\mathbf{Z}) \le \frac{N}{2} \log\left(2\pi e\, |\Phi|^{1/N}\right) \triangleq N g\left(|\Phi|^{1/N}\right), \tag{4}$$

where $|\Phi|$ is the determinant of $\Phi$. Furthermore, (4) is satisfied with equality if $\mathbf{Z}$ is Gaussian. In particular, if $N = 1$, then

$$h(Z) \le g(\sigma_z^2),$$

where $\sigma_z^2$ is the variance of $Z$. This lemma is proved in Ref. 1 (Theorem 4.5.1).

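Lemma 2: Let $\mathbf{W} \to \mathbf{X} \to \mathbf{Y}$ be a Markov chain, where

$$Y_k = X_k + Z_k, \quad k = 1, \ldots, N,$$

and $\{Z_k\}$ are independent of $\mathbf{W}$. Then

$$\exp\left(\frac{2}{N} h(\mathbf{Y} \mid \mathbf{W})\right) \ge \exp\left(\frac{2}{N} h(\mathbf{X} \mid \mathbf{W})\right) + \exp\left(\frac{2}{N} h(\mathbf{Z})\right). \tag{5}$$

In particular, if the $\{Z_k\}$ are independent and identically distributed (iid) Gaussian with variance $\sigma^2$, then

$$\exp\left(\frac{2}{N} h(\mathbf{Y} \mid \mathbf{W})\right) \ge \exp\left(\frac{2}{N} h(\mathbf{X} \mid \mathbf{W})\right) + 2\pi e \sigma^2.$$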

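To prove Lemma 2, we note that the unconditional form is due to Blachman (Ref. 2). Inequality (5) then holds pointwise on $\mathbf{W}$. Taking logs,

$$\frac{2}{N} h(\mathbf{Y} \mid \mathbf{W} = \mathbf{w}) \ge \log\left[\exp\left(\frac{2}{N} h(\mathbf{X} \mid \mathbf{W} = \mathbf{w})\right) + \exp\left(\frac{2}{N} h(\mathbf{Z})\right)\right].$$

The function $\log(e^x + k)$ is convex in $x$, so we can average both sides over $\mathbf{W}$ and preserve the direction of the inequality using Jensen's inequality. Exponentiating yields Lemma 2.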
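A quick numerical sanity check of the unconditional form of (5) for $N = 1$ can be sketched in Python as follows. The example is not from the paper: it assumes $X$ and $Z$ are independent and uniform on $[0, 1]$, so that $h(X) = h(Z) = 0$ nats and $Y = X + Z$ has a triangular density on $[0, 2]$. The function name and the grid size are hypothetical.

```python
import math

def triangular_entropy(num_cells=200_000):
    """Midpoint-rule estimate of h(Y) in nats for Y = X + Z, X and Z uniform on [0, 1]."""
    width = 2.0 / num_cells
    total = 0.0
    for i in range(num_cells):
        y = (i + 0.5) * width
        density = y if y <= 1.0 else 2.0 - y    # triangular density of the sum
        total -= density * math.log(density) * width
    return total

h_x = 0.0    # differential entropy of a uniform variable on [0, 1]
h_z = 0.0
h_y = triangular_entropy()
lhs = math.exp(2.0 * h_y)                         # left-hand side of (5), N = 1
rhs = math.exp(2.0 * h_x) + math.exp(2.0 * h_z)   # right-hand side of (5)
print(h_y, lhs, rhs)
```

In this non-Gaussian example the inequality is strict; for Gaussian $X$ and $Z$ the two sides coincide, which is why the Gaussian test channel of Section III can meet the converse bound.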

II. CONVERSE PART OF THEOREM 1 

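The mutual information between source block $\mathbf{X}$ and the joint-channel estimate of $\mathbf{X}$, denoted by $\hat{\mathbf{X}}_3$, satisfies the following inequality:

$$\begin{aligned}
I(\mathbf{X}; \hat{\mathbf{X}}_3) &\overset{(a)}{\le} I(\mathbf{X}; f_1(\mathbf{X}) f_2(\mathbf{X})) \\
&\le H(f_1(\mathbf{X}), f_2(\mathbf{X})) \\
&= H(f_1(\mathbf{X})) + H(f_2(\mathbf{X})) - I(f_1(\mathbf{X}); f_2(\mathbf{X})) \\
&\overset{(b)}{=} I(\mathbf{X}; f_1(\mathbf{X})) + I(\mathbf{X}; f_2(\mathbf{X})) - I(f_1(\mathbf{X}); f_2(\mathbf{X})) \\
&\overset{(c)}{\le} N(R_1 + R_2) - I(f_1(\mathbf{X}); f_2(\mathbf{X})) \\
&\overset{(a)}{\le} N(R_1 + R_2) - I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2),
\end{aligned} \tag{6}$$

where the steps labeled (a) follow from the data-processing theorem, (b) from the fact that $f_1$ and $f_2$ are determined by $\mathbf{X}$, and (c) from the channel constraints.

By the converse to the source-coding theorem,

$$d_3 \ge D\left(\frac{1}{N} I(\mathbf{X}; \hat{\mathbf{X}}_3)\right).$$

Using eq. (1a) for $D(R)$, we have

$$d_3 \ge \exp\left(-\frac{2}{N} I(\mathbf{X}; \hat{\mathbf{X}}_3)\right) \ge \exp[-2(R_1 + R_2)] \exp\left(\frac{2}{N} I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2)\right), \tag{7}$$

where the second inequality follows from (6).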
We need now to lower-bound the second exponent in (7). To do this, we define an artificial random vector $\mathbf{Y}$, formed by adding to $\mathbf{X}$ a zero-mean Gaussian vector $\mathbf{Z}$, whose components are independent and have common variance $\epsilon$. Although $\mathbf{Y}$ is independent of $\hat{\mathbf{X}}_1$ and $\hat{\mathbf{X}}_2$, given $\mathbf{X}$, and plays no apparent intuitive role in the encoding/decoding process, $\mathbf{Y}$ provides the crucial lower bound in the proof.

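It is true that

$$I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2 \mathbf{Y}) = I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2 \mid \mathbf{Y}) + I(\hat{\mathbf{X}}_1; \mathbf{Y}) = I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2) + I(\hat{\mathbf{X}}_1; \mathbf{Y} \mid \hat{\mathbf{X}}_2).$$

Therefore,

$$\begin{aligned}
I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2) &= I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2 \mid \mathbf{Y}) + I(\hat{\mathbf{X}}_1; \mathbf{Y}) - I(\hat{\mathbf{X}}_1; \mathbf{Y} \mid \hat{\mathbf{X}}_2) \\
&\ge I(\hat{\mathbf{X}}_1; \mathbf{Y}) - I(\hat{\mathbf{X}}_1; \mathbf{Y} \mid \hat{\mathbf{X}}_2) \\
&= I(\hat{\mathbf{X}}_1; \mathbf{Y}) + I(\hat{\mathbf{X}}_2; \mathbf{Y}) - I(\hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2; \mathbf{Y}),
\end{aligned} \tag{8}$$

where the inequality follows from the nonnegativity of mutual information, and all other steps follow from the identity $I(A; BC) = I(A; B) + I(A; C \mid B)$. Now for $i = 1, 2$,

$$\begin{aligned}
\frac{1}{N} \sum_{k=1}^{N} E[(\hat{X}_{ik} - Y_k)^2] &= \frac{1}{N} \sum_{k=1}^{N} E[(\hat{X}_{ik} - X_k + X_k - Y_k)^2] \\
&= \frac{1}{N} \sum_{k=1}^{N} \left[E(\hat{X}_{ik} - X_k)^2 + E(X_k - Y_k)^2\right] \\
&= d_i + \epsilon,
\end{aligned}$$

where the cross term vanishes since $\{Z_k\}$ are independent of all else. Also $\mathbf{Y}$ is a Gaussian vector with independent components, each of variance $1 + \epsilon$. The rate-distortion function for $Y$ is then given by (1):

$$R_Y(d) = \frac{1}{2} \log \frac{1 + \epsilon}{d}.$$

So by the converse to the source-coding theorem,

$$\frac{1}{N} I(\hat{\mathbf{X}}_i; \mathbf{Y}) \ge \frac{1}{2} \log \frac{1 + \epsilon}{d_i + \epsilon}, \quad i = 1, 2. \tag{9}$$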

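As for the last term in (8),

$$\begin{aligned}
I(\hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2; \mathbf{Y}) &= h(\mathbf{Y}) - h(\mathbf{Y} \mid \hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2) \\
&= N g(1 + \epsilon) - h(\mathbf{Y} \mid \hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2) && \text{(Lemma 1)} \\
&\le N g(1 + \epsilon) - \frac{N}{2} \log\left[\exp\left(\frac{2}{N} h(\mathbf{X} \mid \hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2)\right) + 2\pi e \epsilon\right]. && \text{(Lemma 2)}
\end{aligned}$$

But

$$\begin{aligned}
h(\mathbf{X} \mid \hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2) &= h(\mathbf{X} \mid \hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2) - h(\mathbf{X}) + h(\mathbf{X}) \\
&= -I(\mathbf{X}; \hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2) + h(\mathbf{X}) \\
&= h(\mathbf{X}) - H(\hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2) + H(\hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2 \mid \mathbf{X}) \\
&\overset{(a)}{=} h(\mathbf{X}) - H(\hat{\mathbf{X}}_1, \hat{\mathbf{X}}_2) \\
&= h(\mathbf{X}) - H(\hat{\mathbf{X}}_1) - H(\hat{\mathbf{X}}_2) + I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2) \\
&= N g(1) - H(\hat{\mathbf{X}}_1) - H(\hat{\mathbf{X}}_2) + I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2) && \text{(Lemma 1)} \\
&\ge N(g(1) - R_1 - R_2) + I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2), && \text{(channel constraint)}
\end{aligned}$$

where step (a) follows from the fact that $\hat{\mathbf{X}}_1$ and $\hat{\mathbf{X}}_2$ are determined by $\mathbf{X}$. Therefore,

$$I(\hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2; \mathbf{Y}) \le N g(1 + \epsilon) - \frac{N}{2} \log\left\{\exp[2(g(1) - R_1 - R_2)] \exp\left(\frac{2}{N} I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2)\right) + 2\pi e \epsilon\right\}.$$

Since $e^{2g(1)} = 2\pi e$,

$$I(\hat{\mathbf{X}}_1 \hat{\mathbf{X}}_2; \mathbf{Y}) \le N g(1 + \epsilon) - N g\left(\exp[-2(R_1 + R_2)] \exp\left(\frac{2}{N} I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2)\right) + \epsilon\right). \tag{10}$$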



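Combining (8), (9), and (10), and defining $t \triangleq \exp[2 I(\hat{\mathbf{X}}_1; \hat{\mathbf{X}}_2)/N]$, we have

$$t \ge \frac{(1 + \epsilon)\{\exp[-2(R_1 + R_2)]\, t + \epsilon\}}{(\epsilon + d_1)(\epsilon + d_2)}.$$

Isolating $t$,

$$t\{(\epsilon + d_1)(\epsilon + d_2) - (1 + \epsilon)\exp[-2(R_1 + R_2)]\} \ge \epsilon(1 + \epsilon),$$

$$t\left(\epsilon^2 + \epsilon\{d_1 + d_2 - \exp[-2(R_1 + R_2)]\} + d_1 d_2 - \exp[-2(R_1 + R_2)]\right) \ge \epsilon(1 + \epsilon).$$

Since $d_i \ge \exp(-2R_i)$, the quadratic on the left-hand side is always nonnegative (as long as $\epsilon$ is). Define $\Delta$ and $\Pi$ as in the statement of Theorem 1, so that

$$t \ge \frac{\epsilon(1 + \epsilon)}{\epsilon^2 + \epsilon(1 + \Delta - \Pi) + \Delta}. \tag{11}$$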
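This inequality holds for any $\epsilon > 0$. In particular, choose that $\epsilon$ which maximizes the right-hand side. Taking derivatives and setting to zero, it is readily shown that the maximizing $\epsilon$ is given by

$$\epsilon = \frac{\sqrt{\Delta}}{\sqrt{\Pi} - \sqrt{\Delta}}.$$

The numerator of (11) is

$$\epsilon(1 + \epsilon) = \frac{\sqrt{\Delta}}{\sqrt{\Pi} - \sqrt{\Delta}} \left(1 + \frac{\sqrt{\Delta}}{\sqrt{\Pi} - \sqrt{\Delta}}\right) = \frac{\sqrt{\Pi \Delta}}{(\sqrt{\Pi} - \sqrt{\Delta})^2}.$$

The denominator is

$$\begin{aligned}
\epsilon^2 + \epsilon(1 + \Delta - \Pi) + \Delta
&= \frac{\Delta}{(\sqrt{\Pi} - \sqrt{\Delta})^2} + \frac{\sqrt{\Delta}}{\sqrt{\Pi} - \sqrt{\Delta}} (1 + \Delta - \Pi) + \Delta \\
&= \frac{1}{(\sqrt{\Pi} - \sqrt{\Delta})^2} \left[\Delta + \sqrt{\Delta}(\sqrt{\Pi} - \sqrt{\Delta})(1 + \Delta - \Pi) + \Delta(\sqrt{\Pi} - \sqrt{\Delta})^2\right] \\
&= \frac{1}{(\sqrt{\Pi} - \sqrt{\Delta})^2} \left[\Delta + \sqrt{\Pi \Delta} - \Delta - \sqrt{\Delta}(\sqrt{\Pi} - \sqrt{\Delta})(\Pi - \Delta) + \Delta(\sqrt{\Pi} - \sqrt{\Delta})^2\right] \\
&= \frac{1}{(\sqrt{\Pi} - \sqrt{\Delta})^2} \left[\sqrt{\Pi \Delta} - \sqrt{\Delta}(\sqrt{\Pi} - \sqrt{\Delta})^2(\sqrt{\Pi} + \sqrt{\Delta}) + \Delta(\sqrt{\Pi} - \sqrt{\Delta})^2\right] \\
&= \frac{1}{(\sqrt{\Pi} - \sqrt{\Delta})^2} \left[\sqrt{\Pi \Delta} - \sqrt{\Pi \Delta}(\sqrt{\Pi} - \sqrt{\Delta})^2\right] \\
&= \frac{\sqrt{\Pi \Delta}}{(\sqrt{\Pi} - \sqrt{\Delta})^2} \left[1 - (\sqrt{\Pi} - \sqrt{\Delta})^2\right].
\end{aligned}$$

Therefore, (11) becomes

$$t \ge \frac{1}{1 - (\sqrt{\Pi} - \sqrt{\Delta})^2}.$$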

Substituting into (7) yields the third inequality of Theorem 1. The first 
two inequalities in Theorem 1 are, of course, trivial. Theorem 1 
(converse) is proved. 
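The optimization over $\epsilon$ in (11) can be checked numerically. The Python sketch below uses illustrative values $\Pi = 0.49$ and $\Delta = 0.04$ (for instance $d_1 = d_2 = 0.3$ with $\exp[-2(R_1 + R_2)] = 0.05$); these numbers, the function names, and the search grid are assumptions made for the example.

```python
import math

def rhs_of_11(eps, pi_term, delta):
    """Right-hand side of (11) as a function of eps."""
    return eps * (1.0 + eps) / (eps * eps + eps * (1.0 + delta - pi_term) + delta)

def best_eps(pi_term, delta):
    """Maximizing eps stated after (11); requires Pi > Delta > 0."""
    return math.sqrt(delta) / (math.sqrt(pi_term) - math.sqrt(delta))

pi_term, delta = 0.49, 0.04     # illustrative values with Pi > Delta > 0
eps_star = best_eps(pi_term, delta)
closed_form = 1.0 / (1.0 - (math.sqrt(pi_term) - math.sqrt(delta)) ** 2)
# Brute-force search over eps in (0, 5] as an independent check.
grid_max = max(rhs_of_11(k * 1e-3, pi_term, delta) for k in range(1, 5001))
print(eps_star, rhs_of_11(eps_star, pi_term, delta), closed_form, grid_max)
```

The grid search never exceeds the closed-form value, and the value at the stated $\epsilon$ matches it, in agreement with the algebra above.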
III. FORWARD PART OF THEOREM 1

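To prove the forward theorem we evaluate the following achievable region for the general case of Fig. 1, found by El Gamal and Cover (Ref. 3). Consider $\{X_k\}$ drawn iid from alphabet $\mathcal{X}$ according to probability assignment $p(x)$. Let $\hat{\mathcal{X}}_1$, $\hat{\mathcal{X}}_2$, $\hat{\mathcal{X}}_3$ be the appropriate reproducing alphabets at the three receivers in Fig. 1, and let $d_1(\cdot,\cdot)$, $d_2(\cdot,\cdot)$, $d_3(\cdot,\cdot)$ be the respective (single-letter) distortion measures. Consider a test encoder of the form shown in Fig. 2. That is, let auxiliary random variables $U \in \mathcal{U}$ and $V \in \mathcal{V}$ be arbitrarily jointly distributed, given $X$. For any three decoding functions,

$$g_1: \mathcal{U} \to \hat{\mathcal{X}}_1,$$
$$g_2: \mathcal{V} \to \hat{\mathcal{X}}_2,$$
$$g_3: \mathcal{U} \times \mathcal{V} \to \hat{\mathcal{X}}_3,$$

average distortions

$$\begin{aligned}
d_1 &= E[d_1(X, g_1(U))], \\
d_2 &= E[d_2(X, g_2(V))], \\
d_3 &= E[d_3(X, g_3(U, V))]
\end{aligned} \tag{12}$$

are achievable if

$$\begin{aligned}
R_1 &\ge I(U; X), \\
R_2 &\ge I(V; X), \\
R_1 + R_2 &\ge I(UV; X) + I(U; V).
\end{aligned} \tag{13}$$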
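Applying this result to our problem, let

$$U = X + N_1,$$
$$V = X + N_2,$$

where $N_1$ and $N_2$ are jointly zero-mean Gaussian with covariance matrix

$$\Phi_1 = \begin{bmatrix} \sigma_1^2 & \sigma_1 \sigma_2 \rho \\ \sigma_1 \sigma_2 \rho & \sigma_2^2 \end{bmatrix}.$$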




Fig. 2. Test encoder for achievable regions.
$\Phi_1$ is also the covariance matrix of $U$ and $V$ given $X$. Without the conditioning on $X$, $U$ and $V$ are jointly Gaussian with covariance matrix

$$\Phi_2 = \begin{bmatrix} 1 + \sigma_1^2 & 1 + \sigma_1 \sigma_2 \rho \\ 1 + \sigma_1 \sigma_2 \rho & 1 + \sigma_2^2 \end{bmatrix}.$$

Lemma 1 allows us to evaluate the right-hand side of (13) as

$$I(U; X) = h(U) - h(U \mid X) = \frac{1}{2} \log \frac{1 + \sigma_1^2}{\sigma_1^2},$$

$$I(V; X) = h(V) - h(V \mid X) = \frac{1}{2} \log \frac{1 + \sigma_2^2}{\sigma_2^2},$$

$$\begin{aligned}
I(UV; X) + I(U; V) &= h(UV) - h(UV \mid X) + h(U) + h(V) - h(UV) \\
&= \frac{1}{2} \log \frac{(1 + \sigma_1^2)(1 + \sigma_2^2)}{\sigma_1^2 \sigma_2^2 (1 - \rho^2)}.
\end{aligned}$$

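Clearly, the best $g_1(u)$, $g_2(v)$, and $g_3(u, v)$ are the minimum mean-squared-error estimates of $x$, given the respective arguments. These are given by

$$g_1(u) = \frac{1}{E[U^2]}\, u,$$

$$g_2(v) = \frac{1}{E[V^2]}\, v,$$

and

$$g_3(u, v) = \frac{E[V^2] - E[UV]}{E[U^2] E[V^2] - (E[UV])^2}\, u + \frac{E[U^2] - E[UV]}{E[U^2] E[V^2] - (E[UV])^2}\, v.$$

Evaluating the various expressions and substituting into (12) yields

$$\begin{aligned}
d_1 &= \frac{\sigma_1^2}{1 + \sigma_1^2}, \\
d_2 &= \frac{\sigma_2^2}{1 + \sigma_2^2}, \\
d_3 &= \frac{\sigma_1^2 \sigma_2^2 (1 - \rho^2)}{\sigma_1^2 \sigma_2^2 (1 - \rho^2) + \sigma_1^2 + \sigma_2^2 - 2 \sigma_1 \sigma_2 \rho}.
\end{aligned} \tag{14}$$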
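The constraints (13) become

$$\begin{aligned}
R_1 &\ge \frac{1}{2} \log \frac{1}{d_1}, \\
R_2 &\ge \frac{1}{2} \log \frac{1}{d_2}, \\
R_1 + R_2 &\ge \frac{1}{2} \log \frac{1}{d_1 d_2 (1 - \rho^2)}.
\end{aligned}$$

We can therefore choose $\rho$ arbitrarily, so long as

$$d_1 d_2 (1 - \rho^2) \ge \exp[-2(R_1 + R_2)],$$

$$\rho^2 \le \frac{d_1 d_2 - \exp[-2(R_1 + R_2)]}{d_1 d_2}.$$

Choose

$$\rho = -\frac{\sqrt{d_1 d_2 - \exp[-2(R_1 + R_2)]}}{\sqrt{d_1 d_2}}.$$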



Substituting this value for $\rho$, and using the fact that $\sigma_i^2 = d_i/(1 - d_i)$ for $i = 1, 2$ [obtained from the first two parts of (14)], the last equation in (14) can be written as

$$d_3 = \frac{\exp[-2(R_1 + R_2)]}{D}, \tag{15}$$

where

$$\begin{aligned}
D &= \exp[-2(R_1 + R_2)] + d_1(1 - d_2) + d_2(1 - d_1) + 2\sqrt{1 - d_1}\sqrt{1 - d_2}\sqrt{d_1 d_2 - \exp[-2(R_1 + R_2)]} \\
&= \exp[-2(R_1 + R_2)] - d_1 d_2 + d_1 + d_2 - d_1 d_2 + 2\sqrt{1 - d_1}\sqrt{1 - d_2}\sqrt{d_1 d_2 - \exp[-2(R_1 + R_2)]} \\
&= -\Delta - \Pi + 1 + 2\sqrt{\Pi \Delta} \\
&= 1 - (\sqrt{\Pi} - \sqrt{\Delta})^2,
\end{aligned}$$

where $\Pi$ and $\Delta$ are as before. Equation (15) thus reduces to the last part of eq. (3), and Theorem 1 is proved.
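The agreement between the forward construction and the converse bound can be confirmed numerically. The Python sketch below builds the Gaussian test channel from given $(R_1, R_2, d_1, d_2)$, taking the negative root for $\rho$ (the sign that reproduces the cross term $+2\sqrt{\Pi\Delta}$ in $D$), and compares the resulting $d_3$ with the right-hand side of the last inequality in (3). The function names and the sample point are hypothetical.

```python
import math

def achieved_joint_distortion(r1, r2, d1, d2):
    """d3 of the Gaussian test channel with sigma_i^2 = d_i / (1 - d_i)."""
    floor = math.exp(-2.0 * (r1 + r2))
    delta = d1 * d2 - floor
    rho = -math.sqrt(delta / (d1 * d2))      # negative correlation (assumed sign)
    s1 = math.sqrt(d1 / (1.0 - d1))
    s2 = math.sqrt(d2 / (1.0 - d2))
    core = (s1 * s2) ** 2 * (1.0 - rho * rho)
    return core / (core + s1 * s1 + s2 * s2 - 2.0 * s1 * s2 * rho)

def theorem_bound(r1, r2, d1, d2):
    """Right-hand side of the last inequality in (3)."""
    floor = math.exp(-2.0 * (r1 + r2))
    pi_term = (1.0 - d1) * (1.0 - d2)
    delta = d1 * d2 - floor
    return floor / (1.0 - (math.sqrt(pi_term) - math.sqrt(delta)) ** 2)

# Sample point with d1 >= exp(-2 R1), d2 >= exp(-2 R2), and Pi >= Delta.
r1, r2, d1, d2 = 0.8, 0.6, 0.3, 0.4
print(achieved_joint_distortion(r1, r2, d1, d2), theorem_bound(r1, r2, d1, d2))
```

The two printed values agree to rounding error at this sample point, which illustrates that the inner and outer bounds coincide for the Gaussian source.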

IV. CONCLUSIONS 

We have obtained the solution to the channel-splitting problem 
described in the introduction and depicted in Fig. 1, for the case where 
the input letters are iid Gaussian and the distortion measure of interest
is the mean-squared error. So far, no complete solution is known for 
any other source or distortion measure. Wolf et al. (Ref. 4) have obtained an
outer bound for the case of a binary symmetric source with Hamming 
(i.e., probability of error) distortion and have compared it in one case 
to the achievable region of Cover and El Gamal, but the bound exceeds 
the achievable point. Also, Witsenhausen (Ref. 5) has considered a version of
the binary problem and, in particular, has obtained, under slightly 
different assumptions, a stronger outer bound at one extreme point. 